Spectral Products¶

Spectral products are created from mathematical combinations of specific bands (wavelength components). These spectral products can be useful for identifying spatial variations in vegetation, water, or urbanization. This notebook covers the following spectral products: Fractional Cover, NDBI, NDVI, NDWI, SAVI, and EVI.
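The normalized-difference indices used below (NDBI, NDVI, NDWI) all follow the same pattern: the difference of two bands divided by their sum, which maps any pair of non-negative band values into [-1, 1]. A minimal illustration of that pattern with made-up reflectance values (the numbers are purely illustrative):

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """Generic normalized-difference index: (a - b) / (a + b), bounded by [-1, 1]."""
    band_a = np.asarray(band_a, dtype=float)
    band_b = np.asarray(band_b, dtype=float)
    return (band_a - band_b) / (band_a + band_b)

# NDVI = (NIR - Red) / (NIR + Red): high over vegetation, negative over water
nir = np.array([0.45, 0.30, 0.02])   # dense vegetation, sparse cover, water (illustrative)
red = np.array([0.05, 0.15, 0.05])
ndvi = normalized_difference(nir, red)
print(ndvi)  # approximately [0.8, 0.33, -0.43]
```

The same helper applied to other band pairs yields NDWI (green/nir) and NDBI (swir1/nir).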

In [1]:
import datacube
dc = datacube.Datacube(app='Spectral_Products')

import sys, os
os.environ['USE_PYGEOS'] = '0'

import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np  
import xarray as xr  

from datacube.utils import masking
from dea_tools.plotting import rgb, display_map

### EASI tools
sys.path.append(os.path.expanduser('../scripts'))
from ceos_utils.data_cube_utilities.clean_mask import landsat_clean_mask_invalid, landsat_qa_clean_mask
from easi_tools import EasiDefaults
from easi_tools import notebook_utils
easi = EasiDefaults() # Get the default parameters for this system
Successfully found configuration for deployment "eail"
In [2]:
cluster, client = notebook_utils.initialize_dask(use_gateway=False)
display(cluster if cluster else client)
print(notebook_utils.localcluster_dashboard(client, server=easi.hub))
2023-05-19 20:29:46,135 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/dask-worker-space/worker-ozw8axqt', purging

LocalCluster (bb35f62e), status: running
Dashboard: http://127.0.0.1:8787/status
Workers: 4, total threads: 8, total memory: 29.00 GiB
https://hub.eail.easi-eo.solutions/user/jhodge/proxy/8787/status
In [3]:
from datacube.utils.aws import configure_s3_access
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client)
Out[3]:
<botocore.credentials.Credentials at 0x7fc0ad82acb0>
In [4]:
# Select a Product and Platform
product = "s2_l2a"
platform = "Sentinel-2A"
In [5]:
# Select an analysis region (Lat-Lon) and a time period (Min-Max, Year-Month-Day)
# within the available extents of the product.
# This region and time period will be used for the cloud assessment

# South Santo, Sanma - Main Island, Vanuatu
# Path of TS Harold, April 6, 2020
# latitude = (-15.682, -15.491) 
# longitude = (166.736, 166.904) 

# Efate Island, Vanuatu (near Port Vila)
# latitude = (-17.810, -17.691) 
# longitude = (168.247, 168.349) 

# South Pentecost - Vanuatu
# Path of TS Harold, April 7, 2020
# latitude = (-16.040, -15.778) 
# longitude = (168.158, 168.291) 

# West Majuro - Marshall Islands
# latitude = (7.132, 7.169) 
# longitude = (171.023, 171.048) 

# Kumasi, Ghana
# latitude = (6.529, 6.8734) 
# longitude = (-1.7954, -1.4211) 

# Tonga - Main Island
# latitude = (-21.29, -21.03) 
# longitude = (-175.37, -175.02) 

# Hornsby, AUS - High Frog Density
# latitude = (-33.8, -33.7) 
# longitude = (151.04, 151.16) 

# Niue Island
# latitude = (-19.1743, -18.9264) 
# longitude = (-169.9694, -169.7523) 

latitude = easi.latitude
longitude = easi.longitude

# Time Period (Year-Mon-Day)
time_extents = ('2021-03-01', '2021-06-01')
In [6]:
# The code below renders a map that can be used to view the region.
display_map(longitude, latitude)
/env/lib/python3.10/site-packages/dea_tools/plotting.py:313: FutureWarning: This function is deprecated. See: https://pyproj4.github.io/pyproj/stable/gotchas.html#upgrading-to-pyproj-2-from-pyproj-1
  all_longitude, all_latitude = transform(Proj(crs), Proj('EPSG:4326'), all_x,
Out[6]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Load the dataset and the required spectral bands or other parameters¶

In [7]:
sentinel_dataset = dc.load(latitude = latitude,
                          longitude = longitude,
                          time = time_extents,
                          product = product,
                          group_by='solar_day',
                          measurements = ['red', 'green', 'blue', 'nir', 'swir_1', 'swir_2', 'SCL'],
                          output_crs = 'EPSG:6933',
                          resolution = (-30,30),
                          dask_chunks = {'time':1}
                          ) 
In [8]:
# Change the names to work with this notebook and other ceos_utils functions
sentinel_dataset = sentinel_dataset.rename(
    {
        'swir_1':'swir1', 
        'swir_2':'swir2', 
        'SCL':'scl',
    }
) 
In [9]:
sentinel_dataset
Out[9]:
<xarray.Dataset>
Dimensions:      (time: 19, y: 342, x: 322)
Coordinates:
  * time         (time) datetime64[ns] 2021-03-01T16:02:51 ... 2021-05-30T16:...
  * y            (y) float64 4.418e+06 4.418e+06 ... 4.408e+06 4.408e+06
  * x            (x) float64 -7.386e+06 -7.386e+06 ... -7.376e+06 -7.376e+06
    spatial_ref  int32 6933
Data variables:
    red          (time, y, x) uint16 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
    green        (time, y, x) uint16 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
    blue         (time, y, x) uint16 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
    nir          (time, y, x) uint16 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
    swir1        (time, y, x) uint16 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
    swir2        (time, y, x) uint16 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
    scl          (time, y, x) uint8 dask.array<chunksize=(1, 342, 322), meta=np.ndarray>
Attributes:
    crs:           EPSG:6933
    grid_mapping:  spatial_ref

Mask out clouds and cloud shadows + water (if desired) and create a median mosaic¶

In [10]:
# Sen2Cor SCL classes excluded here: 0 = no data, 1 = saturated or defective,
# 3 = cloud shadow, 8 = cloud (medium probability), 9 = cloud (high probability),
# 10 = thin cirrus. Class 6 = water.
cloud_mask = ~sentinel_dataset.scl.isin([0, 1, 3, 8, 9, 10])
land_mask = cloud_mask & (sentinel_dataset.scl != 6)

# Land and Water Dataset = land and water pixels with NO clouds and NO cloud shadows
land_and_water_dataset = sentinel_dataset.where(cloud_mask)

# Land Dataset = land-only pixels with NO clouds, NO cloud shadows and NO water
land_dataset = sentinel_dataset.where(land_mask)

from ceos_utils.data_cube_utilities.dc_mosaic import create_median_mosaic, create_max_ndvi_mosaic, create_hdmedians_multiple_band_mosaic
In [11]:
# Run a Dask compute() to load the data to memory. This will make the rest of the process quicker.
land_dataset = land_dataset.compute()
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
In [12]:
# Select a compositing method to create your cloud-filtered mosaic
# Remove the comments from the pair of lines under one of the mosaic types
# Options are: Median or Max_NDVI 

# This is the MEDIAN mosaic
land_and_water_composite = create_median_mosaic(land_and_water_dataset, cloud_mask)
land_composite = create_median_mosaic(land_dataset, land_mask)
cloud_mask_composite = cloud_mask.max('time')

# This is the MAX_NDVI mosaic
# land_and_water_composite = create_max_ndvi_mosaic(land_and_water_dataset, cloud_mask)
# land_composite = create_max_ndvi_mosaic(land_dataset, land_mask)
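Conceptually, `create_median_mosaic` collapses the masked time stack into a single image by taking the per-pixel median of the valid (non-NaN) observations. A minimal sketch of that idea with `np.nanmedian` (not the `ceos_utils` implementation, which also handles nodata values and dtypes):

```python
import numpy as np

# Synthetic stack: 3 time steps of a 2 x 2 scene; NaN = masked (cloudy) observation
stack = np.array([
    [[0.2, np.nan], [0.4, 0.5]],
    [[0.3, 0.7],    [np.nan, 0.5]],
    [[0.4, 0.9],    [0.6, np.nan]],
])

# Per-pixel median over the time axis, skipping masked observations
mosaic = np.nanmedian(stack, axis=0)
print(mosaic)  # [[0.3, 0.8], [0.5, 0.5]]
```

A max-NDVI mosaic instead selects, for each pixel, the observation from the time step with the highest NDVI.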
In [13]:
land_composite
Out[13]:
<xarray.Dataset>
Dimensions:      (y: 342, x: 322)
Coordinates:
  * y            (y) float64 4.418e+06 4.418e+06 ... 4.408e+06 4.408e+06
  * x            (x) float64 -7.386e+06 -7.386e+06 ... -7.376e+06 -7.376e+06
    spatial_ref  int32 6933
Data variables:
    red          (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    green        (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    blue         (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    nir          (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    swir1        (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    swir2        (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    scl          (y, x) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
In [14]:
# RGB image options
# Standard RGB = 321 = Red, Green, Blue
# False Color = 543 = SWIR1, NIR, Red
# False Color (Sentinel-2 Mosaic) = 742 = SWIR2, NIR, Green

rgb(land_composite, bands=['swir2', 'nir', 'green'], robust=True, size=8)
plt.axis('off')
plt.show()
/env/lib/python3.10/site-packages/numpy/lib/nanfunctions.py:1217: RuntimeWarning: All-NaN slice encountered
  r, k = function_base._ureduce(a, func=_nanmedian, axis=axis, out=out,

Land Spectral Indices¶

In [15]:
def NDBI(dataset):
    return (dataset.swir1 - dataset.nir)/(dataset.swir1 + dataset.nir)
In [16]:
def NDVI(dataset):
    return (dataset.nir - dataset.red)/(dataset.nir + dataset.red)
In [17]:
def NDWI(dataset):
    return (dataset.green - dataset.nir)/(dataset.green + dataset.nir)
In [18]:
def SAVI(dataset):
    return (dataset.nir - dataset.red)/(dataset.nir + dataset.red + 0.5)*1.5
In [19]:
def EVI(dataset):
    return 2.5*(dataset.nir - dataset.red)/(dataset.nir + 6.0*dataset.red - 7.5*dataset.blue + 1.0)
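One caveat worth noting: unlike the normalized-difference indices, SAVI and EVI contain additive constants (the soil factor L = 0.5, the +1.0 in EVI) that assume reflectance scaled to 0-1. If the loaded bands are scaled integers (Sentinel-2 L2A surface reflectance is commonly scaled by 10000; check the product metadata for this deployment), divide by the scale factor before applying these formulas, otherwise the constants become negligible. A quick illustration with a standalone SAVI helper and synthetic values:

```python
def savi(nir, red, L=0.5):
    """Soil Adjusted Vegetation Index; L is the soil-brightness correction factor."""
    return (nir - red) / (nir + red + L) * (1.0 + L)

# The same pixel expressed as 0-1 reflectance and as x10000 scaled integers
nir_refl, red_refl = 0.45, 0.05
nir_scaled, red_scaled = 4500.0, 500.0

print(savi(nir_refl, red_refl))      # 0.600 -- L damps the index as intended
print(savi(nir_scaled, red_scaled))  # ~1.200 -- L is negligible; SAVI collapses to 1.5 * NDVI
```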
In [20]:
# Water pixels masked (land only)
ndbi = NDBI(land_composite)  # Normalized Difference Built-up (Urbanization) Index
ndvi = NDVI(land_composite)  # Normalized Difference Vegetation Index
ndwi = NDWI(land_composite)  # Normalized Difference Water Index

# Water pixels not masked
ndbi2 = NDBI(land_and_water_composite)  # Normalized Difference Built-up (Urbanization) Index
ndvi2 = NDVI(land_and_water_composite)  # Normalized Difference Vegetation Index
ndwi2 = NDWI(land_and_water_composite)  # Normalized Difference Water Index

# Other Indices
savi = SAVI(land_composite)  # Soil Adjusted Vegetation Index
evi = EVI(land_composite)  # Enhanced Vegetation Index
In [21]:
ds_ndvi = ndvi2.to_dataset(name="NDVI")
ds_ndwi = ndwi2.to_dataset(name="NDWI")
ds_ndbi = ndbi2.to_dataset(name="NDBI")
normalization_dataset = ds_ndvi.merge(ds_ndwi).merge(ds_ndbi)

Land Fractional Cover¶

Fractional Cover (FC) estimates the land-cover composition (green vegetation, non-green vegetation, bare soil) of each pixel. We use a model from CSIRO (Juan Gerschmann) and apply it to a median mosaic, where Bare Soil = bs, Photosynthetic Vegetation = pv, and Non-Photosynthetic Vegetation = npv. The product is a false-color RGB result where RGB = bs/pv/npv.
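The idea behind fractional cover is linear spectral unmixing: each pixel's spectrum is modeled as a weighted mix of endmember spectra (one per cover class), and the weights are recovered by a least-squares fit. The sketch below shows only this general idea with made-up endmember spectra; it is not the CSIRO model used by `frac_coverage_classify`, which uses many more terms and constraints:

```python
import numpy as np

# Illustrative endmember spectra (columns: bare soil, green veg, non-green veg)
# over four bands (red, nir, swir1, swir2). These numbers are made up.
endmembers = np.array([
    [0.30, 0.04, 0.15],   # red
    [0.35, 0.45, 0.25],   # nir
    [0.40, 0.20, 0.35],   # swir1
    [0.38, 0.10, 0.30],   # swir2
])

# A pixel that is an exact 40% bare soil / 60% green vegetation mixture
pixel = 0.4 * endmembers[:, 0] + 0.6 * endmembers[:, 1]

# Least-squares fit recovers the per-class fractions (clipped to be non-negative)
fractions, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
fractions = np.clip(fractions, 0.0, None)
print(fractions)  # approximately [0.4, 0.6, 0.0]
```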

In [22]:
land_composite = land_composite.rename_dims(
    {
        'x': 'latitude', # This is wrong because it is in northings/eastings, but the function below requires 'latitude'
        'y': 'longitude' # This is wrong because it is in northings/eastings, but the function below requires 'longitude'
    }
)
land_composite
Out[22]:
<xarray.Dataset>
Dimensions:      (longitude: 342, latitude: 322)
Coordinates:
  * y            (longitude) float64 4.418e+06 4.418e+06 ... 4.408e+06 4.408e+06
  * x            (latitude) float64 -7.386e+06 -7.386e+06 ... -7.376e+06
    spatial_ref  int32 6933
Dimensions without coordinates: longitude, latitude
Data variables:
    red          (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    green        (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    blue         (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    nir          (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    swir1        (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    swir2        (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
    scl          (longitude, latitude) float64 dask.array<chunksize=(342, 322), meta=np.ndarray>
NOTE: for some reason this doesn't work properly and the Fractional Cover image gets messed up. Something is going wrong with the handling of arrays. It would be better if the function worked in pure xarray rather than NumPy arrays, as there are too many possibilities for losing the relationships between dimensions.
In [23]:
from ceos_utils.data_cube_utilities.dc_fractional_coverage_classifier import frac_coverage_classify 
# frac_classes = frac_coverage_classify(land_composite, clean_mask = 
#                                       np.ones(land_composite.scl.shape).astype(bool)) 

# For some reason, the expected order of the clean_mask shape is transposed.
# It is reversed here just to avoid an error, but there is a deeper issue.
# (Note: np.bool is deprecated since NumPy 1.20; use the builtin bool instead.)
frac_classes = frac_coverage_classify(land_composite, clean_mask = 
                                      np.ones((land_composite.scl.shape[1], land_composite.scl.shape[0])).astype(bool)) 
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
/env/lib/python3.10/site-packages/numpy/lib/nanfunctions.py:1217: RuntimeWarning: All-NaN slice encountered
  r, k = function_base._ureduce(a, func=_nanmedian, axis=axis, out=out,
In [24]:
# Plot of Fractional Cover
# RED = Bare Soil or Urban Areas
# BLUE = Non-Green Vegetation
# GREEN = Green Vegetation
# BLACK = Water

# Plot of RGB = NDBI-NDVI-NDWI
# RED = Bare Soil or Urban Areas
# GREEN = Vegetation
# BLUE = Water

fig, ax = plt.subplots(1, 2, figsize=(16, 10))

fc_rgb = frac_classes[['bs', 'pv', 'npv']].to_array()
ndi_rgb = normalization_dataset[['NDBI','NDVI','NDWI']].to_array()

fc_rgb.plot.imshow(ax=ax[0], vmin=0.0, vmax=100.0)
ndi_rgb.plot.imshow(ax=ax[1], vmin=-1.0, vmax=1.0)

ax[0].set_title('Fractional Cover'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title('Normalized Difference RGB'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
plt.show()
In [25]:
# %pylab is deprecated; use %matplotlib inline and explicit imports instead
%matplotlib inline
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (10, 10)
In [26]:
# Create a custom colour map for NDVI
# Water (blue) = NDVI -1.0 to 0.05
# Urban or Bare Soil (brown) = NDVI 0.05 to 0.25
# Low Vegetation (tan) = NDVI 0.25 to 0.4
# Croplands (light green) = NDVI 0.4 to 0.6
# Dense Vegetation / Forests (dark green) = NDVI 0.6 to 1.0

ndvi_cmap = mpl.colors.ListedColormap(['blue', '#a52a2a', '#ffffcc', '#2eb82e', '#006600'])
ndvi_bounds = [-1, 0.05, 0.25, 0.4, 0.6, 1]
ndvi_norm = mpl.colors.BoundaryNorm(ndvi_bounds, ndvi_cmap.N)
In [27]:
fig, ax = plt.subplots(1, 2, figsize=(14, 8))
ax[0].imshow(ndvi2, cmap="Greens", vmin=-1.0, vmax=1.0)
ax[1].imshow(ndvi2, cmap=ndvi_cmap, norm = ndvi_norm)
ax[0].set_title('NDVI Basic'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title('NDVI Custom Colormap'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
plt.show()
In [28]:
# EVI and SAVI indices using "land only" pixels
fig, ax = plt.subplots(1, 2, figsize=(18, 8))
(evi).plot.imshow(ax=ax[0], cmap="Greens", vmin=0.0, vmax=2.5)
(savi).plot.imshow(ax=ax[1], cmap="Greens", vmin=-1.0, vmax=1.0)
ax[0].set_title('EVI'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title('SAVI'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
plt.show()
In [29]:
# Compare colormaps for NDVI
fig, ax = plt.subplots(1, 2, figsize=(18, 8))
(ndvi).plot.imshow(ax=ax[0], cmap="Greens", vmin=0.0, vmax=1.0)
(ndvi).plot.imshow(ax=ax[1], cmap="viridis", vmin=0.0, vmax=1.0)
ax[0].set_title('NDVI-Greens'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title('NDVI-Viridis'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
plt.show()

Create a threshold plot¶

First we define a minimum threshold and a maximum threshold. We then create a plot that colours pixels inside the threshold range a single colour (red), pixels outside the range black, and masked (e.g. cloudy) pixels white. We will also calculate the number and percentage of pixels that fall within the threshold range.
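At its core, the threshold plot below builds a boolean mask with `np.logical_and` and uses it to index into an RGB array. A toy sketch of that masking step, with an invented 2x2 array in which `NaN` stands in for a cloudy pixel:

```python
import numpy as np

data = np.array([[0.10, 0.50],
                 [0.90, np.nan]])

# NaN compares False against both bounds, so cloudy pixels fall "outside"
inside = np.logical_and(data > 0.3, data < 0.8)
print(inside)  # [[False  True] [False False]]

# Paint inside-pixels red, leave everything else black
rgb = np.zeros((*data.shape, 3), dtype=np.uint8)
rgb[inside] = [255, 0, 0]
print(rgb[0, 1])  # [255   0   0]
```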

In [30]:
from matplotlib.ticker import FuncFormatter

def threshold_plot(da, min_threshold, max_threshold, mask = None, width = 10, *args, **kwargs):
    color_in    = np.array([255, 0, 0])      # red = inside the threshold range
    color_out   = np.array([0, 0, 0])        # black = outside the threshold range
    color_cloud = np.array([255, 255, 255])  # white = masked (e.g. cloud)

    array = np.zeros((*da.values.shape, 3)).astype(np.uint8)

    inside  = np.logical_and(da.values > min_threshold, da.values < max_threshold)
    outside = np.invert(inside)
    masked  = np.zeros(da.values.shape).astype(bool) if mask is None else mask

    array[inside]  = color_in
    array[outside] = color_out
    array[masked]  = color_cloud

    def figure_ratio(ds, fixed_width = 10):
        height = len(ds.latitude) * (fixed_width / len(ds.longitude))
        return (fixed_width, height)

    fig, ax = plt.subplots(figsize = figure_ratio(da, fixed_width = width))

    # Label the pixel axes with the corresponding latitude/longitude values
    lat_formatter = FuncFormatter(lambda y_val, pos: "{0:.3f}".format(da.latitude.values[np.clip(int(y_val), 0, len(da.latitude) - 1)]))
    lon_formatter = FuncFormatter(lambda x_val, pos: "{0:.3f}".format(da.longitude.values[np.clip(int(x_val), 0, len(da.longitude) - 1)]))

    ax.xaxis.set_major_formatter(lon_formatter)
    ax.yaxis.set_major_formatter(lat_formatter)

    plt.title("Threshold: {} < x < {}".format(min_threshold, max_threshold))
    plt.xlabel('Longitude')
    plt.ylabel('Latitude')

    plt.imshow(array, *args, **kwargs)
    plt.show()
In [31]:
# Select a threshold range for your spectral variable and generate a plot
# Remove comments from the set of 3 lines for your desired variable
# Variable choices are NDBI, NDVI, EVI, FC-Bare Soil, FC-Photosynthetic Vegetation

# NDBI (Built-up Index) = -1.0 to 1.0 (full range)
# NDBI 0.0 to 0.3 is typical for urban areas
# -----------------------
# minimum_threshold = 0.0
# maximum_threshold = 0.3
# threshold_plot(ndbi, minimum_threshold, maximum_threshold, width = 8)

# NDVI (Vegetation Index) = -1.0 to 1.0
# NDVI < 0.0 = water
# NDVI 0.0 to 0.2 = non-vegetation (bare soil)
# NDVI 0.2 to 0.6 = grasslands
# NDVI 0.6 to 0.9 = dense vegetation / trees
# -----------------------
# minimum_threshold = 0.05
# maximum_threshold = 0.25
# threshold_plot(ndvi, minimum_threshold, maximum_threshold, width = 8)

# EVI (Vegetation Index) = -1.0 to 2.5
# EVI 2.0 to 2.5 is typical for dense vegetation
# -----------------------
# minimum_threshold = 2.0
# maximum_threshold = 2.5
# threshold_plot(evi, minimum_threshold, maximum_threshold, width = 8)

# Fractional Cover (pv,npv,bs) = 0 to 100
# Bare Soil (bs) >40 = urbanization / bare soil
# ----------------------
# minimum_threshold = 20.0
# maximum_threshold = 100.0
# threshold_plot(frac_classes.bs, minimum_threshold, maximum_threshold, width = 8)

# Fractional Cover (pv,npv,bs) = 0 to 100
# Vegetation (pv) >80 = dense green vegetation
# ----------------------
minimum_threshold = 80.0
maximum_threshold = 100.0
threshold_plot(frac_classes.pv, minimum_threshold, maximum_threshold, width = 8)
In [32]:
def threshold_count(da, min_threshold, max_threshold, mask = None):
    def count_not_nans(arr):
        return np.count_nonzero(~np.isnan(arr))
    
    in_threshold = np.logical_and( da.values > min_threshold, da.values < max_threshold)
    
    total_non_cloudy = count_not_nans(da.values) if mask is None else np.sum(mask.values)
    
    return dict(total = np.size(da.values),
                total_non_cloudy = total_non_cloudy,
                inside = np.nansum(in_threshold),
                outside = total_non_cloudy - np.nansum(in_threshold)
               )    
    
def threshold_percentage(da, min_threshold, max_threshold, mask = None):
    counts = threshold_count(da, min_threshold, max_threshold, mask = mask)
    return dict(percent_inside_threshold = (counts["inside"]   / counts["total"]) * 100.0,
                percent_outside_threshold = (counts["outside"] / counts["total"]) * 100.0,
                percent_clouds = ( 100.0-counts["total_non_cloudy"] / counts["total"] * 100.0))
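The arithmetic in `threshold_count` and `threshold_percentage` can be checked on a tiny invented array: four pixels, one of them `NaN` (a "cloud"), and two values inside the (80, 100) range used below, which should give 50% inside and 25% cloud:

```python
import numpy as np

values = np.array([81.0, 95.0, 10.0, np.nan])
inside = np.logical_and(values > 80.0, values < 100.0)

total = values.size                               # 4 pixels in total
non_cloudy = np.count_nonzero(~np.isnan(values))  # 3 valid (non-NaN) pixels
n_inside = int(np.nansum(inside))                 # 2 pixels inside the range

print(n_inside / total * 100.0)                   # 50.0
print(100.0 - non_cloudy / total * 100.0)         # 25.0
```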
In [33]:
# Select a threshold statistical function that matches your land spectral variable
# COUNT = number of pixels in each category
# PERCENTAGE = percent of pixels in each category
# ---------------------------------

# NDBI Threshold
# threshold_count(ndbi,minimum_threshold,maximum_threshold, cloud_mask_composite)
# threshold_percentage(ndbi,minimum_threshold,maximum_threshold)

# NDVI Threshold
# threshold_count(ndvi,minimum_threshold,maximum_threshold)
# threshold_percentage(ndvi,minimum_threshold,maximum_threshold)

# EVI Threshold
# threshold_count(evi,minimum_threshold,maximum_threshold)
# threshold_percentage(evi,minimum_threshold,maximum_threshold)

# Fractional Cover - Bare Soil
# threshold_count(frac_classes.bs, minimum_threshold, maximum_threshold)
# threshold_percentage(frac_classes.bs, minimum_threshold, maximum_threshold)

# Fractional Cover - Photosynthetic Vegetation
threshold_count(frac_classes.pv, minimum_threshold, maximum_threshold)
# threshold_percentage(frac_classes.pv, minimum_threshold, maximum_threshold)
Out[33]:
{'total': 110124,
 'total_non_cloudy': 110124,
 'inside': 1947,
 'outside': 108177}
In [34]:
# threshold_percentage(ndbi,minimum_threshold,maximum_threshold)
threshold_percentage(frac_classes.pv, minimum_threshold, maximum_threshold)
Out[34]:
{'percent_inside_threshold': 1.7680069739566306,
 'percent_outside_threshold': 98.23199302604337,
 'percent_clouds': 0.0}

GeoTIFF Output Products¶

In [35]:
from ceos_utils.data_cube_utilities.dc_utilities import write_geotiff_from_xr

# Remove the comment to create a GeoTIFF output product
# Change the name of the output file, or it will be overwritten for each run 
# Change the desired bands at the end of the function

# Fractional Coverage
# write_geotiff_from_xr("geotiffs/frac_classes.tif", frac_classes, bands=['bs'])

# NDVI
# write_geotiff_from_xr("geotiffs/ndvi_land.tif", ndvi)

# EVI
# write_geotiff_from_xr("geotiffs/evi.tif", evi)

# WOFS
# write_geotiff_from_xr("geotiffs/wofs.tif", water_classification.wofs)
In [36]:
# !ls -lah geotiffs/*.tif